Generating Custom Code for Efficient Query Execution on Heterogeneous Processors
Authors
- Sebastian Breß (DFKI GmbH and TU Berlin)
- Bastian Köcher (TU Berlin)
- Henning Funke (TU Dortmund)
- Tilmann Rabl (TU Berlin and DFKI GmbH)
- Volker Markl (TU Berlin and DFKI GmbH)
Abstract
Processor manufacturers build increasingly specialized processors to mitigate the effects of the power wall and deliver improved performance. Currently, database engines are manually optimized for each processor, a costly and error-prone process. In this paper, we propose concepts that enable the database engine to perform per-processor optimization automatically. Our core idea is to create variants of generated code and to learn a fast variant for each processor. We create variants by modifying parallelization strategies, specializing data structures, and applying different code transformations. Our experimental results show that the performance of variants may diverge by up to two orders of magnitude. Therefore, we need to generate custom code for each processor to achieve peak performance. We show that our approach finds a fast custom variant for multi-core CPUs, GPUs, and MICs.
Fig. 1 Modern processors expose heterogeneity in the form of heterogeneous cores located on the same processor chip or specialized accelerator cards. (Figure labels: CPU cores, GPU cores, heterogeneous processor chip, interconnect, dedicated accelerator (GPU, MIC).)
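The abstract's core idea, creating several variants of the same query code and learning which one runs fastest on a given processor, can be illustrated with a minimal sketch. This is not the authors' implementation: the query (a filtered sum), the three variant functions, and the helper `pick_fastest` are hypothetical names chosen for illustration, and plain Python on one CPU stands in for generated code on heterogeneous processors.

```python
import time

# Hypothetical variants of one query: SELECT SUM(v) FROM t WHERE v < threshold.
# All variants compute the same result but use different code shapes.

def variant_branching(values, threshold):
    # Evaluates the predicate with an explicit branch per tuple.
    total = 0
    for v in values:
        if v < threshold:
            total += v
    return total

def variant_predicated(values, threshold):
    # Branch-free form: multiplies each value by the 0/1 predicate result.
    total = 0
    for v in values:
        total += v * (v < threshold)
    return total

def variant_builtin(values, threshold):
    # Delegates the loop to the built-in sum over a generator.
    return sum(v for v in values if v < threshold)

VARIANTS = {
    "branching": variant_branching,
    "predicated": variant_predicated,
    "builtin": variant_builtin,
}

def pick_fastest(variants, values, threshold, repeats=3):
    # Times every variant on the current machine and returns the name of
    # the fastest one together with the best observed time per variant.
    timings = {}
    for name, fn in variants.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(values, threshold)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return min(timings, key=timings.get), timings

if __name__ == "__main__":
    data = list(range(100_000))
    name, timings = pick_fastest(VARIANTS, data, 50_000)
    print("fastest variant on this machine:", name)
```

Which variant wins depends on the machine the sketch runs on, which is the point the abstract makes about per-processor tuning. An exhaustive timing loop like this one is only the simplest possible selection strategy.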
Similar resources
Optimized Composition: Generating Efficient Code for Heterogeneous Systems from Multi-Variant Components, Skeletons and Containers
In this survey paper, we review recent work on frameworks for the high-level, portable programming of heterogeneous multi-/manycore systems (especially, GPU-based systems) using high-level constructs such as annotated user-level software components, skeletons (i.e., predefined generic components) and containers, and discuss the optimization problems that need to be considered in selecting among ...
Accelerating high-order WENO schemes using two heterogeneous GPUs
A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...
A Code Optimization Framework for Performance Portability of GPU Kernels onto Custom Accelerators
The shift toward parallel computing has resulted in a growing interest in computing systems with heterogeneous processing modules. Reconfigurable devices are often employed in such heterogeneous systems due to their low power and parallel processing benefits. An important issue in the programmability of these systems is the need for a single programming interface. Recent works have leveraged ...
A Backend Extension Mechanism for PQL/Java with Free Run-Time Optimisation
In many data processing tasks, declarative query programming offers substantial benefit over manual data analysis: the query processors found in declarative systems can use powerful algorithms such as query planning to choose high-level execution strategies during compilation. However, the principal downside of such languages is that their primitives must be carefully curated, to allow the quer...
Accelerating control-flow intensive code in spatial hardware
Designers are increasingly utilizing spatial (e.g. custom and reconfigurable) architectures to improve both efficiency and performance in increasingly heterogeneous systems-on-chip. Unfortunately, while such architectures can provide orders of magnitude better efficiency and performance on numeric applications, they exhibit poor performance when implementing sequential, control-flow intensive co...
Journal title:
- CoRR
Volume abs/1709.00700, Issue -
Pages -
Publication date 2017